This chapter describes problems you might experience while using the LAN Network Manager program and suggests how to resolve these problems. Refer to the /usr/CML/lpp.README file for additional information that might help you resolve problems.
If you experience problems with the LAN Network Manager program, first attempt to identify and solve the problem yourself. The Gathering Problem Information section describes tools you can use to help you obtain more information about the status of the LAN Network Manager program and the events that occur while you are using the program. The sections that follow describe specific problems and suggest how to resolve them. Specifically, this chapter includes the following topics:
If you cannot solve a problem with the information in this chapter, call the IBM Technical Support Center in the United States. The phone number is 1 (800) 237-5511. Customers outside of the United States should contact their country's support center. When you call, be prepared with your customer number, your LAN Network Manager for AIX component ID, and a description of the problem. You can use the worksheet in Problem Documentation Worksheet to help you gather the necessary information. It is helpful if you can recreate the problem; otherwise, the Technical Support Center personnel will attempt to do so.
If you experience problems with the LAN Network Manager program, the following tools might help you or IBM Technical Support Center personnel identify the problem:
The cmlstatus command reports the status of each LAN Network Manager daemon configured to operate. You can use the cmlstatus command to determine the current status of the LAN Network Manager daemons, their process identifiers (PIDs), and their exit statuses. For specific information about the possible exit statuses that can be returned by the cmlstatus command, refer to the man page.
If an unexpected change takes place on the graphical interface, or if an expected change fails to take place, consult the nettl log to determine what might have happened with the resources and applications involved.
It is recommended that you use nettl logging whenever you are operating the LAN Network Manager program.
For example, if you attempt to manually discover an LNM OS/2 agent but no icon is created on the LAN Network submap, it is possible that you have not created a configuration file for that agent. If this is the case, the nettl log will contain an entry identifying the agent by its internet address and explaining the cause of the error. Knowing this, you can then create a configuration file with SMIT and rediscover the agent.
For more information about the format of the nettl log and how to view its contents, refer to Using NetView for AIX Logs.
Clearing the LAN Network Manager databases is necessary to resolve certain problems. However, before you clear the LAN Network Manager databases:
Clear the LAN Network Manager databases in the following situations:
If any of the other NetView for AIX processes are abnormally terminated, it is also recommended that you follow the previous steps before clearing the databases.
This section lists problems you might encounter using the LAN Network Manager program that are not obviously associated with a specific LAN Network Manager application. For information about problems that are clearly associated with a specific LAN Network Manager application, see later sections in this chapter.
If you attempt to manually discover an agent but an icon representing the subnet that the agent manages does not appear on the LAN submap, or the icon that represents the agent is blue, look in the nettl log for information about the problem, and ensure that the agent program is running.
If you are not receiving traps from an agent program in your network, begin by looking for a mismatch between community names and ensure you have activated the LAN Network Manager filter.
For LAN Network Manager to receive traps properly, the community name specified at the agent workstation must match the community name defined for that agent in NetView for AIX. To ensure that trap authentication is correctly configured, check the following:
If they do not match, you can change the agent community name to match the default community name defined in NetView for AIX, or you can define a node-specific community name for the agent on the SNMP Configuration window in NetView for AIX.
If you are receiving agent traps but they are not being displayed in the NetView for AIX event display, you may need to activate the LAN Network Manager filter file. This filter specifies which LAN Network Manager traps are to be displayed in the event display.
To activate the LAN Network Manager filter, follow these steps:
You can deactivate the filter by selecting the filter on the Filter Control window and selecting the Deactivate push button.
If the LAN icon does not appear on the Root submap, verify the following conditions:
If these conditions are true and the LAN icon still does not appear in the Root submap, clear the LAN Network Manager databases.
If the icons of SNMP bridges in LAN Subnet submaps are blue (unknown), check the status of the lnmbrmon daemon by entering the command:
/usr/CML/bin/cmlstatus lnmbrmon
If lnmbrmon is not running, restart it by entering:
/usr/CML/bin/cmlstart lnmbrmon
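The check-and-restart sequence above can be sketched as a short script. This is a minimal sketch, not part of the product. The function name daemon_in_ps is hypothetical, and the sketch assumes only that a running daemon appears by name in ps -ef output. It does not rely on the output format or exit status of cmlstatus, which are described in the man page.

```shell
# Return 0 if a process named $1 appears in `ps -ef`-style output read
# from standard input. The name is compared with the last path component
# of every field from the eighth onward, so a process that merely has the
# daemon name as an argument also matches (a limitation of this sketch).
daemon_in_ps() {
    awk -v name="$1" '
        {
            for (i = 8; i <= NF; i++) {
                n = split($i, parts, "/")
                if (parts[n] == name) found = 1
            }
        }
        END { exit (found ? 0 : 1) }'
}

# Restart lnmbrmon only when it is not running. The guard keeps the
# sketch harmless on a system where LAN Network Manager is not installed.
if [ -x /usr/CML/bin/cmlstart ]; then
    if ps -ef | daemon_in_ps lnmbrmon; then
        echo "lnmbrmon is running"
    else
        /usr/CML/bin/cmlstart lnmbrmon
    fi
fi
```

After a restart, enter /usr/CML/bin/cmlstatus lnmbrmon again to confirm the daemon status.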
Protocol switching is disabled for duplicate MAC addresses within the LAN Network Manager domain.
If you experience problems that seem to be associated with the OS/2 agent application, first follow the general problem determination suggestions in Gathering Problem Information. Other sources of information that may be helpful include the following:
Verify the connection
Use xnmbrowser of the NetView for AIX platform to query the SNMP agent on the LNM OS/2 agent workstation. In addition to defining the trap destination on the LNM OS/2 workstation, the COMMUNITYNAME must be set in CONFIG.SYS for traps to be forwarded to a NetView for AIX workstation. Use netstat -s on the LNM OS/2 workstation to verify port assignment.
nettl log (see Checking the nettl Log)
Trap numbers for OS/2 agent-initiated traps correspond to DFI message numbers. Use the LAN Network Manager for OS/2 Version 2.0 documentation to resolve errors reported by traps. You will have to read the trap description to distinguish between traps with the same trap number.
Run command return codes written to the nettl log by the OS/2 agent monitor program (lnmlnmemon) usually correspond to DFI message numbers in the LAN Network Manager for OS/2 Version 2.0 documentation. Exceptions are noted in the documentation for message 610 in Messages.
A "1" has been appended to the DFI message number for the messages displayed by the LNM OS/2 Agent application if there is a matching DFI number.
Execute cmlstatus to determine the state of the OS/2 agent monitor program. Refer to the man page for cmlstatus for a description of the exit statuses provided by cmlstatus.
Save formatted nettl logs and the trapd.log.
Save the core image or output from the dbx command.
Execute ps -ef | grep lnm to ensure that the following conditions exist:
If the problem is repeatable, tracing files are extremely useful. You can start tracing on both the lnmlnmeint and lnmlnmemon daemons.
Before starting tracing on lnmlnmemon, ensure that lnmlnmemon has started the lnmlnmeint and lnmBaseTimer processes by entering:
ps -ef | grep lnm
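The check above can also be scripted so that it names the processes that are missing. This is a minimal sketch. The function name missing_processes is hypothetical, and it assumes that each process appears under its own name in ps -ef output.

```shell
# Print each expected process name that does not appear as a whole word
# in the `ps -ef`-style listing read from standard input.
missing_processes() {
    listing=$(cat)
    for name in "$@"; do
        # grep -w matches the name as a whole word; print the name only
        # when no line of the listing contains it.
        printf '%s\n' "$listing" | grep -w "$name" >/dev/null ||
            printf '%s\n' "$name"
    done
}

# Report any of the two processes that lnmlnmemon should have started.
missing=$(ps -ef | missing_processes lnmlnmeint lnmBaseTimer)
[ -z "$missing" ] || echo "Not running: $missing"
```

If the script reports a missing process, resolve that before you start tracing on lnmlnmemon.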
To obtain a complete log, turn off automatic agent discovery and execute cml_agent_found after executing the kill -31 command.
If you have defined the agent program in SMIT and created a configuration file for the agent, but the configuration file is not found, the following message is added to the nettl.log file:
Error Cannot open file = /usr/CML/conf/lnmlnmemon/<IP address of agent>.conf
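To confirm whether the file named in this message exists, you can test for it directly. This is a minimal sketch. The function name check_agent_conf and its optional directory argument are hypothetical, the directory default is taken from the message above, and 9.67.167.11 is only an example address.

```shell
# Report whether the configuration file for an agent exists and is readable.
# $1 = agent IP address
# $2 = configuration directory (optional; defaults to the directory that
#      appears in the nettl log message)
check_agent_conf() {
    conf_dir=${2:-/usr/CML/conf/lnmlnmemon}
    conf_file="$conf_dir/$1.conf"
    if [ -r "$conf_file" ]; then
        echo "found: $conf_file"
    else
        echo "missing or unreadable: $conf_file"
        return 1
    fi
}

# Example address only; substitute the IP address of your agent.
check_agent_conf 9.67.167.11 ||
    echo "Create the configuration file with SMIT, then rediscover the agent."
```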
In order to correctly discover LAN devices:
An adapter may show incorrect status if its status is marginal and the segment to which it is attached is resynchronized. When a segment is resynchronized, the status of congested adapters is reset to normal. Therefore, whenever a segment is automatically or manually resynchronized, any adapters that show a marginal status (yellow) are reset to normal status (green) after the segment is successfully resynchronized.
If you are monitoring an adapter and move it to another ring, the initial monitored adapter not responding trap will be processed against the adapter on the original ring. The monitor adapter responding clear trap will be processed on the new ring. A resync of the ring on which the adapter was previously located should reset the status of the adapter in the old location to unknown. Alternatively, if you know you are going to move a monitored adapter, you can set monitoring off before you move the adapter and set it back on after the adapter has been relocated.
No connections are displayed for inactive stations.
Twenty minutes have been added to the normal time-out value for this request. Successful completion of the request is recorded in /usr/OV/log/trapd.log as trap 465. Trap 465 is defined as LOGONLY, so it is not displayed in the event window. Possible error conditions (traps 464, 466, 468, 439) are displayed in the event window. If the load takes longer than the allotted time-out value, you may receive a false time-out message.
Until a bridge has been linked, the LNM OS/2 agent does not know if the bridge has the potential to be a multiport bridge. LAN Network Manager will place a symbol on the agent submap for each undefined bridge. If it is later determined that the bridge has the potential to be a multiport bridge, LAN Network Manager will delete the symbol from the agent submap.
The status for a resource displayed by the OS/2 agent on the OS/2 agent workstation may differ from the status displayed for the resource by LAN Network Manager. For example:
If a window displays a permanent hourglass symbol, cancel the window from the system menu, then repeat the operation. If the window hangs again, cancel the window and terminate the process lnmlnmemgr. Retrying the operation will restart lnmlnmemgr. In most cases this should resolve the problem.
If commands are flowing from LAN Network Manager to an OS/2 agent and back successfully, requests will time out in the normal course of events if the OS/2 agent does not respond within the allotted time. The allotted time is determined by the number of commands already sent to the OS/2 agent for which no response has yet been received multiplied by the sum of the timeout values for LAN Network Manager and for the OS/2 agent. If a problem develops and the OS/2 agent is unable to respond, the application generates a trap stating that the OS/2 agent is not responding. If this problem continues for a long time, eventually the maximum number of requests which can be sent to the agent until a response is received will be reached. An external symptom of this problem is a permanent hourglass. Close the window using the System menu and restart the OS/2 agent that is not functioning.
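The allotted time described above is a simple product, which the following sketch works through. The function name allotted_time is hypothetical, and the values used (3 outstanding commands, time-outs of 30 and 60 seconds) are illustrative assumptions, not product defaults.

```shell
# allotted time = outstanding commands * (LNM time-out + agent time-out)
# $1 = number of commands sent to the agent that have no response yet
# $2 = LAN Network Manager time-out value
# $3 = OS/2 agent time-out value (same unit as $2)
allotted_time() {
    echo $(( $1 * ($2 + $3) ))
}

# 3 outstanding commands with time-outs of 30 and 60 seconds:
allotted_time 3 30 60    # prints 270
```

Because the allotted time grows with each unanswered command, a window can appear to hang for much longer than either time-out value alone.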
If you delete an agent from SMIT or use cml_agent_remove, open management windows for that agent remain open. Close the windows that are associated with the deleted agent.
If you manage the same segment using an LNM OS/2 agent and an SNMP token-ring agent, both management applications in LAN Network Manager will respond to window requests.
Also, if you have an 8230 Model 003 or Model 013 on the ring, you may lose the capability to manage the 8230 Model 003 or Model 013 as a concentrator.
Traps generated from a token-ring segment on the other side of a transparent bridge lose their routing information at the transparent bridge. If the monitor program receives a trap from the agent that does not have the correct routing information, it will be correlated to the agent submap.
Traps received while the agent is in an unknown state may not be correlated since the view will be refreshed when the agent can be rediscovered.
Traps generated by the LNM OS/2 agent for a bridge are correlated to the agent submap, and no other action is performed, if LNM OS/2 Version 2 classifies the bridge as a multiport bridge in its response to LAN BRG QUERY NAME=<bridge name> ATTR=RPT.
If message 610 with return code 500 is in the nettl log for SEGMENT UTIL with the agent IP address but no segment number (for example, 9.67.167.11), the agent configuration file is corrupted. Delete the agent using the Delete LNM OS/2 Agent option in SMIT, then add the agent using SMIT and rediscover it.
This function is not supported for token-ring segments managed by the LNM OS/2 agent.
If you experience problems that seem to be associated with the SNMP Token-Ring application, follow the general problem determination suggestions in Gathering Problem Information. Next, check the following sources of information:
nettl log (see Checking the nettl Log) and trapd log
Execute cmlstatus to determine the state of the SNMP Token-Ring monitor program. Refer to the man page for cmlstatus for a description of the exit statuses provided by cmlstatus.
Save formatted nettl logs and the trapd.log.
Save the core image or output from the dbx command.
Execute ps -ef | grep lnm to ensure that the following conditions exist:
If the problem is repeatable, tracing files are extremely useful. You can start tracing on the lnmtrmon daemon.
Start tracing on lnmtrmon to log the command flow between LAN Network Manager and the SNMP Token-Ring agents.
An SNMP token-ring segment or device may run any of the following agents:
LAN Network Manager retrieves different configuration information and sends different management instructions depending on whether the agent manages a segment, station, or bridge. LAN Network Manager allows only one agent (called the primary agent) to manage a segment or device at a given time and merges the information received from the secondary and tertiary agents to provide a single view. The primary agent is selected in the following order according to the completeness of segment or device information provided:
An SNMP token-ring segment or device in LAN submaps may be incorrectly displayed for any of the following reasons:
To resolve the problem, delete the agent in one of the following ways before the age-out timer for the primary agent expires:
You can resolve this problem for 8230 concentrators with microcode at Version 5.30 or higher by enabling the RMON agent in the concentrator so that the concentrator information is merged into the Bridge submap. To change the configuration of the RMON agent, use SMIT as described in the online book Coupling and Autodiscovery.
To resolve this problem, use SMIT to reconfigure the RMON agent. To do so, enter smit cml to start SMIT and select Configure -> Configure SNMP Token-Ring application -> Configure IBM SNMP proxy agent -> RMON proxy agent.
The status of an SNMP token-ring segment with one or more 8250 bridges attached may be incorrectly displayed as marginal (yellow) when it really has normal (green) status. This is because the Token-Ring Management Module sometimes reports soft errors when the status of an 8250 bridge is normal. This is a known problem.
Token-Ring segments using token-ring surrogate agents are not discovered by LAN Network Manager unless configReportServer (CRS) and ringErrorMonitor (REM) are running. When using token-ring surrogates to manage token-ring resources, ringParameterServer (RPS) is not required.
If a window displays a permanent hourglass symbol, try to cancel the window from the system menu, then repeat the operation. If the program still hangs, stop and restart LAN Network Manager.
If you change the IP address parameter of an SNMP token-ring agent using SMIT, the change is not taken into account by the SNMP Token-Ring application until you do one of the following:
Stop and restart the SNMP Token-Ring daemon
Delete the agent, then reconfigure and rediscover it using SMIT.
If you manage the same segment using an SNMP token-ring agent and an LNM OS/2 agent, both management applications in LAN Network Manager will respond to window requests.
Also, if you have an 8230 Model 003 or Model 013 on the ring, you may lose the capability to manage the 8230 Model 003 or Model 013 as a concentrator.
If the access control settings you defined for an individual 8230 concentrator are not active, make sure that the general access control settings defined for all token-ring segments are activated.
To activate access control for all token-ring segments, select LAN -> Applications -> SNMP Token Ring -> Access Control Policy from the NetView menu bar. Then set the Access Control parameter to Active.
For more information, see the section "Defining SNMP Token-Ring Access Control Parameters" in the online book Managing SNMP Token-Ring Resources.
If SNMP token-ring stations are periodically removed from Segment submaps, check to see if access control for all token-ring segments is active and if it has been set to override the access control settings of individual segments. This sometimes occurs when another network operator changes the global default settings.
To check access control for all token-ring segments, select LAN -> Applications -> SNMP Token Ring -> Access Control Policy from the NetView menu bar.
In the SNMP Token-Ring - Access Control Policy window, note the current settings of the Access Control and Overwrite Resource-specific parameters. If necessary, you can make either of the following changes:
For more information, see the section "Defining SNMP Token-Ring Access Control Parameters" in the online book Managing SNMP Token-Ring Resources.
If you experience problems that seem to be associated with the SNMP bridge application, follow the general problem determination suggestions in Gathering Problem Information. Next, check the following sources of information:
nettl log (see Checking the nettl Log). The errors logged in the nettl log for the SNMP bridge application include SNMP bridge agent errors that can indicate possible hardware and configuration problems.
Save the core image file
MIB browser dump of the MIB
To obtain a MIB browser dump of the MIB, follow these steps:
If the problem is repeatable, tracing files are extremely useful. You can start tracing on the lnmbrmon daemon.
Start tracing on lnmbrmon to log the command flow between LAN Network Manager and the SNMP bridge agents.
If you are experiencing problems related to the discovery of SNMP bridges, the following procedures might help:
Execute cmlstatus to ensure that lnmtopod and lnmbrmon are up and running.
Check the nettl log to see if there is a message with the IP address of the bridge and an explanation of the error.
If a bridge is in an undiscovered subnet, follow these steps:
You must have a network connection to the bridge before it can be discovered. If you cannot ping the bridge, the SNMP session to the bridge cannot be established.
If you get a timeout, one of three things might be the problem:
You can get a timeout if the wrong community name is specified. LNM for AIX gets the community name from NetView for AIX. To change the community name in NetView for AIX, select SNMP Configuration from the Options pull-down menu. If the SNMP bridge agent has a community name other than public, you must define the IP address and the community name to NetView for AIX before LNM for AIX can discover the bridge. See the section "Configuring General Parameters for SNMP Agents" in the online book Managing SNMP Token-Ring Resources and SNMP Bridges for more information.
If you can get these attributes, the bridge agent is working. If you cannot get bridge attributes, check your bridge installation (hardware, installation, microcode levels, configuration, and so on).
Ensure that the bridge implements the RFC 1286 MIB under 1.3.6.1.2.1.17. The SNMP Bridge application does not discover SNMP bridges that implement RFC 1286 in a private branch.
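When you look at a MIB browser dump, it can help to test whether a given object identifier lies in the standard bridge subtree. This is a minimal sketch. The function name in_standard_bridge_subtree is hypothetical, and the private-branch OID in the example is invented for illustration.

```shell
# Succeed if the OID in $1 is 1.3.6.1.2.1.17 or lies beneath it.
# A leading dot, as some tools print it, is ignored.
in_standard_bridge_subtree() {
    oid=${1#.}
    case "$oid" in
        1.3.6.1.2.1.17|1.3.6.1.2.1.17.*) return 0 ;;
        *) return 1 ;;
    esac
}

# The first OID is in the standard subtree; the second is a made-up
# example of an object in a private (enterprises) branch.
for oid in 1.3.6.1.2.1.17.2.15.1.9.1 1.3.6.1.4.1.9999.17; do
    if in_standard_bridge_subtree "$oid"; then
        echo "$oid: standard bridge subtree"
    else
        echo "$oid: outside 1.3.6.1.2.1.17"
    fi
done
```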
In order for a RouteXpander/2 bridge to be managed by LAN Network Manager, it must be configured with two IP addresses and the protocol must be configured in LAPS. For information on how to use LAPS to do this, refer to the RouteXpander/2 documentation.
Also, when using RouteXpander/2 bridges in your network, it is recommended that you increase the SNMP time-out parameter to 10 seconds or more. To change this value, select the SNMP Configuration option from the Options pull-down menu on the NetView for AIX menu bar.
In order for an SNMP-managed 8227 bridge to be correctly displayed, the bridge must be Version 1.01 or higher.
Also, LAN Network Manager has a restriction that the Ethernet port is always drawn between the 10baseT port area and the AUI port. Full bridge port management is still available for the Ethernet port.
In order for an SNMP-managed 8229 bridge to be correctly displayed, the bridge must be one of the following versions:
Also, these versions of the 8229 bridge have a known problem that causes the ring utilization parameter always to be reported as 0.
The following limitations apply to the display of bridge ports and bridges attached to the 8271 Model 001 Switch:
The following limitations apply to the display of bridge ports on the 8272 Switch:
In submaps of 8281 bridges, LAN Network Manager has a restriction that the Ethernet AUI port is always drawn in the 10base-T port area. Full bridge port management is still available for the AUI port, but the icon is not placed above the AUI port on the submap.
Also, 8281 modules currently report the same sysOid as 8281 standalone bridges. As a result, LAN Network Manager discovers the 8281 module and displays it as a standalone bridge. This is a known problem.
LAN Network Manager currently has a problem obtaining correct values from the SNMP agent in SynOptics bridges (Model 3522 with microcode version 2.2 and boot code version 2.1).
LAN Network Manager issues a single get request with multiple OIDs, rather than multiple gets with a single OID. However, when LAN Network Manager makes a get request having more than one OID, the SynOptics bridge SNMP agent does not always return the correct values for the OIDs requested.
For example, when the LAN Network Manager issues a get request with the OIDs 1.3.6.1.2.1.17.2.15.1.9.1 and 1.3.6.1.2.1.17.2.15.1.9.2, one of the values that the SNMP agent in the SynOptics bridge returns overwrites the other. This causes LAN Network Manager to incorrectly display the segment connectivity of the bridge.
If you experience problems that seem to be associated with the FDDI application, follow the general problem determination suggestions in Gathering Problem Information. Next, check the following sources of information:
nettl log (see Checking the nettl Log) and trapd log
Execute cmlstatus to determine the state of the FDDI application. Refer to the man page for cmlstatus for a description of the exit statuses provided by cmlstatus.
Save formatted nettl logs and the trapd.log.
Save the core image or output from the dbx command.
Execute ps -ef | grep lnm to ensure that the following conditions exist:
MIB browser dump of the MIB. If a station does not show all of its sub-objects (for example, path, path class, attachment), look at a MIB browser dump of the MIB.
To obtain a MIB browser dump of the MIB:
When using the FDDI SNMP proxy agent in your network, it is recommended that you increase the SNMP time-out parameter to 10 seconds or more. To change this value, select the SNMP Configuration option from the Options pull-down menu on the NetView for AIX menu bar.
Also, the FDDI SNMP proxy agent Version 6.0 incorrectly reports private.enterprises.ibm.ibmArchitecture.fddi.fddismt73ext.snmpFddiConfig as 5.9 and LAN Network Manager displays this information.
If you experience problems integrating Hub Manager with LAN Network Manager, follow the general problem determination suggestions in Gathering Problem Information. Next, check the following sources of information:
Ensure that you have defined Hub Manager as an application to be started on the SMIT Applications to be Started When LNM for AIX Starts menu.
Verify that NetView for AIX has discovered the hub.
Ensure that lnmhubint is running using the cmlstatus command.
Ensure that iubd is running using the ovstatus command.
nettl log (see Checking the nettl Log) and trapd log.
Execute cmlstatus to determine the state of the 8250, 8260, and 8265 Device Manager application. Refer to the man page for cmlstatus for a description of the exit statuses provided by cmlstatus.
Save formatted nettl logs.
Save the core image or output from the dbx command.
Execute ps -ef | grep lnm to ensure that the following conditions exist:
Use the following worksheet to gather information about a LAN Network Manager problem. If you call the IBM Technical Support Center for assistance, this information can be helpful.
Customer number |
|
Nways Campus Manager LAN for AIX component ID |
|
Problem symptoms |
AIX operating system |
|
Motif and X11 |
|
NetView for AIX |
|
Nways Campus Manager LAN for AIX |
|
Agent programs possibly involved in the problem |
|
|
Amount of memory (RAM) installed |
|
Amount of paging space available |
|
Amount of free space available in the file system that contains /usr/OV |
|
Amount of free space available in /tmp |
|
Which NetView for AIX applications were running at the time of the problem? |
|
Which mode was NetView for AIX operating in at the time of the problem? | [ ] Read [ ] Read-Write |
What is the size of the network you are managing?
|
|
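Several of the worksheet values can be collected from the command line. This is a minimal sketch, not a supported tool. The function names and the default output file name worksheet.output are hypothetical. The sketch assumes the standard AIX commands oslevel, lsps, and df, and skips any command that is not installed.

```shell
# Append a command and its output to the report, but only if the
# command exists on this system.
# $1 = report file; remaining arguments = command and its arguments
run_if_present() {
    report=$1
    shift
    if type "$1" >/dev/null 2>&1; then
        echo "### $*" >> "$report"
        "$@" >> "$report" 2>&1 || echo "(command failed)" >> "$report"
    fi
}

# Collect worksheet values into the file named by $1.
collect_worksheet_info() {
    out=${1:-worksheet.output}
    : > "$out"
    run_if_present "$out" oslevel                 # AIX operating system level
    run_if_present "$out" lsps -a                 # paging space
    run_if_present "$out" df -k /tmp              # free space in /tmp
    if [ -d /usr/OV ]; then
        run_if_present "$out" df -k /usr/OV       # file system with /usr/OV
    fi
    run_if_present "$out" /usr/CML/bin/cmlstatus  # daemon status
}
```

To use the sketch, enter collect_worksheet_info followed by a file name, then print the file and keep it with the completed worksheet.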
Have the following information available when you call the IBM Technical Support Center:
cmlstatus
Issue the cmlstatus command and direct the output to a file, which you can print and have available when working with the IBM Technical Support Center. To create the output file, enter the following command at an AIX command line:
cmlstatus > cmlstatus.output
You can then print the cmlstatus.output file.
See Displaying LAN Network Manager Status Information for more information about the cmlstatus command.
Log Files
To ensure that you have a record of events that occurred when the problem arose, save the following log files:
See Using NetView for AIX Logs for more information about LAN Network Manager logging.
Core Image
A possible additional source of information about your problem is the core image that might have been created for a LAN Network Manager executable. Perform the following steps to locate and save the core image:
/usr/bin/od -c core 3274 | head